Airline networks are methodically designed, engineered systems with structures that can vary
considerably amongst distinct carriers. We analyze the flight networks of the seven largest passenger
carriers in the USA, characterizing their topological structures and resilience properties.
Structurally, we find that the degree distributions of several of the networks, including the aggregate
over the seven carriers, are well described by simple exponential distributions. Functionally, we find that
networks with "large" k-core structures possess extreme resilience to both random and targeted
removal of either airports (nodes) or flight paths (edges), with no significant increase in estimated
travel time. Similar results are obtained when the targeted removal of airports is by degree or by
betweenness, although the latter causes a faster breakdown of each carrier's network. We
introduce a rewiring scheme that preserves the total number of daily flights and gate requirements while
enhancing k-core structures and resilience (i.e., k-core resilience), which should augment our
understanding of building resilient networks in general. Finally, our findings suggest that point-to-point
topologies have larger k-core structures, providing new insight into the long-standing debate on the
optimality of such layouts when compared to hub-and-spoke arrangements.



Modern civilization relies on the efficient design, op- 
eration and maintenance of a variety of, often interde- 
pendent, infrastructure networks, with aviation networks 
being an archetypical example. Air travel is a principal 
means of fast and effective transportation of people and 
goods over large distances across countries or continents, 
around the globe. It is critical to the functioning of coun- 
tries and the world economy as a whole. The aggregate 
network of air travel worldwide built by considering all 
flights amongst all destinations throughout the globe (the 
World Airline Network, or WAN) has been the subject of much
recent study [1, 2, 3, 4, 5]. The focus has been on analysis of
overall flow patterns and the consequences for the spread
of global epidemics [4], as well as identifying the overall
importance of individual airports [5]. An aggregate level
analysis has also been carried out on the airline networks
of a few individual countries, which show several similarities
to the WAN, namely "scale-free" and small-world
characteristics [6, 7].

Our interest is not in overall flow, but in design and op- 
eration of critical infrastructure. The aggregate view of 
air travel is built up from a collection of co-existing airline 
networks, operated independently by distinct entities. 
Each independent operator must build a well-connected 
and economically successful airline network which is re- 
silient to random or systematic vagaries, ranging from 
acts of nature to terrorism. Furthermore, an individual
airline has direct control only over its own network.
We investigate what changes to network structure of an 
individual carrier can lead to improved efficiency and re- 
silience. 
Herein we analyze and contrast the network structures
of the seven largest passenger airlines in the United States 
of America (USA). When broken down into networks of 
individual carriers, we show that many features differ 
from those of the aggregate view. Small-world attributes 
are exhibited by the networks of all the carriers, yet, 
rather than scale-free power law distributions, we find 
that the distribution in airport connectivity is better de- 
scribed by either a simple exponential decay or a cumu- 
lative log-normal distribution. More pronounced than 
distribution in connectivity, we find that Southwest Air- 
lines (SW) stands apart from the other six carriers by its 
k-core structure (defined in detail below) and its extreme 
resilience to random or targeted deletion of nodes (air- 
ports) or edges (flight paths). Edge deletion corresponds 
to, for instance, weather preventing travel between two 
airports, while node deletion corresponds to temporary 
closure of an airport. SW has essentially built a core 
network, comprising more than half of its overall desti- 
nations, which is a dense mesh of interconnected high- 
degree (i.e., "hub") airports. We explore the interplay 
between placing hubs in the periphery versus the core
of a network and introduce a general network rewiring
process that enhances the k-core structure and increases
the resilience of a network while keeping constant the
demand on each node and the amount of flow between nodes.

One fundamental consideration when building a new 
airline network, or expanding an existing one, is whether 
to prefer "point to point" (PP) or "hub and spoke" (HS) 
connectivity. Of course, aside from functional considera- 
tions, there are also financial reasons why one type may 
be preferred over the other. In the PP scenario, a pas- 
senger can travel on a direct non-stop flight to a range of 
destinations at shorter distances, but to travel consider- 
able lengths has to transit and take multiple flights. In 
the HS scenario, in contrast, a passenger can travel non- 
stop only to a few central hubs, and from there transit to 
their final destination (almost always requiring two hops
unless the hub is their ultimate destination). Rigorous
analysis shows asymptotic optimality of HS models for
spatial transportation networks with transfer costs [8].
Analytic arguments, backed by numerical simulations, indicate
that HS architectures are optimal for travelers
wishing to minimize the number of connecting flights required
instead of overall distance travelled [9]. Inspired
in part by studies on airport networks, a general model
of weighted networks via an optimization principle was
proposed in which a clear spatial hierarchical organization,
with local hubs distributing traffic in smaller regions,
emerges as a result of the optimization [10]. Thus
there seems to be a growing consensus in the literature
regarding HS structures arising out of optimization of resources.
However, real-world structures also need to be
resilient and robust apart from having an optimal distribution
of resources. In this work, we show that PP structures
can be much more resilient than HS structures.

The majority of the larger airlines operating in the 
USA at present predominantly follow the HS pattern. 
This was not the case prior to 1978, when the USA Fed- 
eral Government regulated air traffic, with special atten- 
tion paid to ensure lower traffic (and hence lower-profit) 
routes were not ignored [11] . Such regulations effectively 
enforced PP architectures. Once the Federal Government 
deregulated the airline industry in 1978, most airlines 
gradually shifted to their current HS pattern, apparently 
finding such a HS architecture more desirable. A signifi- 
cant exception was Southwest Airlines (SW), which con- 
tinued to build a PP system. As of the end of 2007, SW 
is the largest airline (by both number of domestic pas- 
sengers and domestic departures) not only in the United 
States, but also in the entire world [12] . Its sheer size 
together with the extremely consistent economic success 
of SW [13] provide strong evidence for the efficacy of 
PP networks. Ryanair and Easyjet are two examples of
successful PP carriers in Europe [14]. Innovative management
policies have played an important part in the
success of SW and are studied extensively in business
literature (for instance, Ref. [15]). Here our focus is on
network infrastructure with a view to efficiently design or 
restructure individual airline networks so they are well- 
connected, robust and resilient to disturbances. 

Our findings are of theoretical interest yet should also 
be relevant to entities engaged in designing or altering 
large-scale airline networks. For instance, expanding air- 
line networks in developing nations need to consider the 
tradeoffs between PP and HS architectures. Airline car- 
riers wanting to shrink an airline (i.e., eliminate flights 
with minimal impact), for instance as rising fuel prices re- 
quire enhanced operating efficiency, need systematic ap- 
proaches for identifying the appropriate manner. Finally, 
individual carriers need metrics to assess the quality of 
network infrastructure resulting from a merger with an- 
other carrier. 



CONSTRUCTING THE AIRLINE NETWORKS 

All certificated U.S. air carriers are required to file 
monthly reports with the U.S. Department of Trans- 
portation, Bureau of Transportation Statistics, detail- 
ing information on every flight segment flown during 
that month. This information is maintained in a public
database, the "Air Carrier Summary: T-100 Domestic
Segment (U.S. Carriers)" [16]. From this database
we download information on every "scheduled passenger 
service" class flight segment flown by each of the seven 
largest U.S. passenger carriers for the entire 2007 calen- 
dar year. To isolate the structure of passenger carriers 
we neglect the small fraction of flights by these carri- 
ers which are designated by the "cargo" (only) class or 
"non-scheduled passenger service" (charter) class. Yet, 
in order to compare the structure of a passenger carrier 
to a cargo-only air carrier, we also download all flights 
flown during the 2007 calendar year by two cargo-only 
carriers (Federal Express and United Parcel Service). 

The seven largest US passenger airlines (by number of
passengers flown) are, in order, Southwest (SW), American
Airlines (AA), Delta (DL), United Airlines (UA),
Northwest (NW), US Airways (US), and Continental
(CO). These seven carriers account for 61.6% of all passengers
enplaned in 2007. We construct two distinct
views of the network for each carrier: one captures
the connectivity (i.e., which airports are connected via direct
flights), and the other captures both connectivity and the
total traffic flow between airports. For carrier $c$ we denote
the first view by $G_c(N_c, E_c)$ and the latter by $W_c(N_c, E_c)$.
The vertices, $N_c$, are the same in both views and are the
set of all airports listed as an origin or destination airport
for carrier $c$ which are also included in that carrier's list
of official domestic destinations as of June 2008.
This additional data "scrubbing" step eliminates airports
used only for diverted aircraft, which have substantially
fewer flights than the official airports and would otherwise
introduce noise. We first construct the complete
flight history for 2007, $W_c(N_c, E_c)$. A directed edge is
added from each origin airport to its destination airport,
with edge weight equal to the total number of flight segments
from that origin to that destination flown by carrier
$c$ in 2007. The unweighted (binary) version of this
graph is $G_c(N_c, E_c)$, the equivalent of the "route
map" for that carrier: the collection of airports serviced,
with an edge between two airports if there is a
direct connection between them. The $G_c(N_c, E_c)$ view
focuses only on the connectivity of the network, while the
$W_c(N_c, E_c)$ view also includes the actual traffic along those
connections.

We consider both node degree and strength. The out-degree
of node $i$, $q_i^{\mathrm{out}}$, is the number of edges originating
at that airport in $G_c(N_c, E_c)$ (the number of distinct
destinations that can be reached directly from $i$). The
in-degree, $q_i^{\mathrm{in}}$, is the number of edges terminating at $i$
(the number of distinct incoming origins). We find $q_i^{\mathrm{in}} \approx q_i^{\mathrm{out}}$
(airports are almost always connected in both directions),
so we simply denote node degree as $q_i$. We also consider
the "strength", $s_i$, of the $i$th node, defined as in Ref. [3].
The in-strength (out-strength) of an airport is the total
number of flights landing (departing) there, for that
specific carrier, in 2007. Formally, the in-strength (out-strength)
is the sum over all edge weights in $W_c(N_c, E_c)$
for edges terminating (originating) at that node. We find
$s_i^{\mathrm{in}} \approx s_i^{\mathrm{out}}$, so for the remainder we treat all edges as undirected
and set the undirected edge weights in $W_c$ to be
the maximum edge weight in either direction.
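
The construction described above can be sketched in a few lines. This is a minimal illustration rather than the code used for the analysis: it assumes the T-100 records have already been reduced to (origin, destination, number of flights) triples, it uses plain Python dictionaries instead of a graph library, and the airport codes "AAA", "BBB" and "CCC" are hypothetical placeholders. Strength is computed here on the symmetrized (undirected) weights.

```python
from collections import defaultdict


def build_views(segments):
    """Return (W, G): the undirected weighted view and the binary view.

    segments: iterable of (origin, destination, n_flights) records
    (an assumed, already-cleaned input format).
    """
    directed = defaultdict(int)
    for origin, dest, n_flights in segments:
        if origin != dest:
            directed[(origin, dest)] += n_flights  # sum repeated records

    # Undirected weight = maximum of the two directed weights.
    W = defaultdict(dict)
    for (a, b), w in directed.items():
        w_max = max(w, directed.get((b, a), 0))
        W[a][b] = w_max
        W[b][a] = w_max

    # Binary "route map": neighbour sets without weights.
    G = {node: set(nbrs) for node, nbrs in W.items()}
    return dict(W), G


def degree(G):
    """Number of distinct airports directly connected to each node."""
    return {node: len(nbrs) for node, nbrs in G.items()}


def strength(W):
    """Sum of undirected edge weights incident on each node."""
    return {node: sum(nbrs.values()) for node, nbrs in W.items()}


# Hypothetical toy data: three airports, five directed records.
segments = [
    ("AAA", "BBB", 10), ("BBB", "AAA", 12),
    ("AAA", "CCC", 5), ("CCC", "AAA", 5),
    ("BBB", "CCC", 3),
]
W, G = build_views(segments)
print(degree(G))
print(strength(W))
```

Taking the maximum of the two directed weights follows the convention stated in the text; summing or averaging the two directions would be equally easy to substitute.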

In addition to the network structures of individual car- 
riers, the aggregate airline network of the USA is of in- 
terest. We study three different aggregate views: Agg7, 
which is the aggregate over the seven largest passenger 
carriers; AggPass, which is the aggregate over all "sched- 
uled passenger service" class flights flown during 2007 by 
all carriers (not just the seven largest); finally, AggAll 
is the aggregate over every single flight segment flown 
during 2007, regardless of service class or carrier. Formally,
to construct the distinct aggregate views we take
the union over all nodes and edges for the set of carriers
involved: $G_{\mathrm{Agg}}(N_{\mathrm{Agg}}, E_{\mathrm{Agg}})$, where $N_{\mathrm{Agg}} = \bigcup_c N_c$ and
$E_{\mathrm{Agg}} = \bigcup_c E_c$, and $W_{\mathrm{Agg}}(N_{\mathrm{Agg}}, E_{\mathrm{Agg}})$, where each edge weight in $E_{\mathrm{Agg}}$ is
the sum over all the corresponding edge weights. Finally,
in light of a recent merger between two of the carriers we
study (NW and DL) [17], we construct the merged networks
$G_{\mathrm{NW+DL}}$ and $W_{\mathrm{NW+DL}}$.
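
The aggregation step amounts to a union of node and edge sets with edge weights summed over carriers. A minimal sketch follows, assuming each carrier's weighted view is stored as a nested dictionary `{airport: {neighbour: weight}}`; that storage format, the function name `aggregate`, and the toy carriers `W_x` and `W_y` are all illustrative assumptions.

```python
def aggregate(weighted_views):
    """Union of nodes and edges over carriers, summing edge weights."""
    agg = {}
    for W in weighted_views:
        for a, nbrs in W.items():
            row = agg.setdefault(a, {})
            for b, w in nbrs.items():
                row[b] = row.get(b, 0) + w
    return agg


# Two hypothetical carriers sharing the AAA-BBB route.
W_x = {"AAA": {"BBB": 4}, "BBB": {"AAA": 4}}
W_y = {"AAA": {"BBB": 6, "CCC": 2}, "BBB": {"AAA": 6}, "CCC": {"AAA": 2}}

W_agg = aggregate([W_x, W_y])
# The binary aggregate view keeps only which airports are connected.
G_agg = {a: set(nbrs) for a, nbrs in W_agg.items()}
print(W_agg)
```

The same function applied to two carriers only would give a merged network in the spirit of the NW+DL view.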



INDIVIDUAL VERSUS AGGREGATE AND 
USEFUL DISCRIMINATORS 

We first compare and contrast the network structures
of the distinct airlines. Results are summarized in Table I,
with the passenger airlines listed first, in order of
increasing number of distinct airports serviced ($N$). The
number of distinct direct connections between airports
for that carrier is $E$ (the total number of edges in the unweighted,
binary view $G_c(N_c, E_c)$). Included in the table
also are the results for the three different aggregate views
(Agg7, AggPass, and AggAll), the two cargo carriers Federal
Express (FX) and United Parcel Service (UPS), and
the tentative "NW+DL" network. The average airport
degree, denoted $\langle q \rangle$, is simply $\langle q \rangle = 2E/N$. The average
shortest path length over all source-destination pairs is
denoted $\langle l \rangle$. This is the average number of flight segments
required to fly from any airport in the network to
any other. The average value of betweenness centrality [18]
is denoted $\langle b \rangle$. The average clustering coefficient [19] is
denoted $\langle C \rangle$.

For comparison, we generate a corresponding Erdős-Rényi
(ER) random graph for each carrier, using that
carrier's $N$ and $E$ values. Results are in Table II. Note that
the values of $\langle l \rangle$ and $\langle b \rangle$ for the actual carriers agree extremely
well with the values for the corresponding ER
realizations, strongly suggesting that density alone determines
these two properties. However, all remaining
properties show significant differences between the real
networks and ER equivalents. Furthermore, all carriers
have $\langle l \rangle < \ln N$ and values of $\langle C \rangle > \langle C_{\mathrm{ER}} \rangle$, and thus can be
considered "small-world" networks.
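
As an illustration of the ER comparison, the sketch below draws a random graph with a prescribed number of nodes and edges and computes its mean clustering coefficient. The text does not state which ER variant was used; the fixed-edge-count model assumed here reproduces a carrier's $N$ and $E$ exactly. The function names and the seed are illustrative choices, and the example uses the SW values $N = 64$, $E = 892$ from Table I.

```python
import itertools
import random


def er_graph(n, m, seed=None):
    """Random graph with exactly n nodes and m distinct undirected edges."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    adj = {i: set() for i in range(n)}
    for a, b in rng.sample(pairs, m):
        adj[a].add(b)
        adj[b].add(a)
    return adj


def mean_clustering(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        links = sum(1 for u, v in itertools.combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


adj = er_graph(64, 892, seed=1)
mean_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
print(mean_degree, mean_clustering(adj))
```

For an ER graph the clustering coefficient is expected to be close to the edge density, which is why a real network with a much larger $\langle C \rangle$ at comparable path length is labelled small-world.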

To quantify the extent to which a network follows the
"hub and spoke" (HS) pattern, the degree assortativity
coefficient [20], $r$, seems to be a natural choice from a network
theory perspective: $r > 0$ indicates a tendency of
high-degree nodes to connect to other high-degree nodes, while
$r < 0$ indicates a tendency of high-degree nodes to connect
to low-degree nodes. Intuitively, a larger negative
(disassortative) value of $r$ should indicate that the network
follows the HS paradigm more closely. Previous
studies have found the airport networks of China and
India and the airline networks of European carriers to
be strongly disassortative (Refs. [6], [7], and [21], respectively),
while in contrast the WAN shows assortative behavior [3].
As can be seen in Table I, we find that all the individual
carriers have disassortative structures. The value of $r$ for
SW is about half the magnitude of the other passenger
carriers, supporting the widely held understanding that
SW has a predominantly PP structure while the other
carriers considered here have a more HS topology. Yet
the value of $r$ for FX is substantially smaller in magnitude
than that for SW, seemingly indicating a lack of HS
structure, even though the topology of FX exhibits strong HS
structure. In this context, we turn to a measure used in
the transportation literature [22] to quantify the extent
of HS structure, the Gini coefficient [23]. The degree Gini
coefficient, $G(q)$, is defined for a network of size $N$ as



$$G(q) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} |q_i - q_j|}{N^2 \langle q \rangle}, \qquad (1)$$

where $\langle q \rangle = 2E/N$. It essentially measures the magnitude
of the difference in node degree between all pairs
of nodes in a network, normalized by average node degree.
Similarly, the strength Gini coefficient $G(s)$ is
defined as $G(s) = \left(\sum_{i=1}^{N} \sum_{j=1}^{N} |s_i - s_j|\right) / (N^2 \langle s \rangle)$, with
$\langle s \rangle = \left(\sum_{i=1}^{N} s_i\right)/N$. As seen in Table I, the Gini coefficient
metric correctly indicates that FX has a strong HS
structure, whereas the value of $r$ misleadingly indicates
a strong PP structure. Likewise, for AggPass and AggAll
(which have strong HS topologies) the values of $r$
mistakenly indicate strong PP structures, while the values
of $G(q)$ correctly indicate strong HS structures. In
contrast to $r$, for all the networks analyzed herein, $G(q)$
and $G(s)$ consistently and unambiguously differentiate
between the HS and PP structures. (Note, due to computational
constraints we could not calculate $G(s)$ for AggAll
and AggPass.) The Gini coefficient has been widely
used in economics [23], ecology [24], etc., and, as found
herein, network studies may benefit from inclusion of this
metric. Interestingly, SW has a value approximately half
the magnitude of the other carriers for both $r$ and $G(q)$.
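
A direct transcription of the Gini definition is short enough to show in full. The sketch below sums $|x_i - x_j|$ over all ordered pairs and divides by $N^2$ times the mean, reading the double sum in Eq. (1) as running over all $i$ and $j$; that reading is an assumption. The conventional Gini index used in economics carries an additional factor of 1/2, since each unordered pair is otherwise counted twice, so values from this function are twice the conventional ones. The function name `gini` and the toy degree sequences are illustrative.

```python
def gini(values):
    """Sum of |x_i - x_j| over all ordered pairs, divided by N^2 * mean.

    Assumption: every ordered pair (i, j) is counted; halve the result
    to obtain the conventional Gini index.
    """
    values = list(values)
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0  # all-zero input: no inequality to measure
    total = sum(abs(x - y) for x in values for y in values)
    return total / (n * n * mean)


# Degree sequence of a 5-node star (pure hub and spoke) ...
star_degrees = [4, 1, 1, 1, 1]
# ... and of a 4-node complete graph (pure point to point).
complete_degrees = [3, 3, 3, 3]

print(gini(star_degrees), gini(complete_degrees))
```

The star, where one hub carries all connections, gives a large value, while the complete graph, where every airport has the same degree, gives zero. This is the sense in which the coefficient separates HS from PP layouts.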

We carried out a detailed analysis of betweenness centrality [18]
in the manner of Ref. [5] for all the passenger
airlines. For a few airlines, we do find examples of
airports where the betweenness of the nodes is relatively
higher in comparison to their degree (e.g., IAH for CO,
PHX for US, STL for AA and LAX for DL). However,
this mismatch is not as strongly disproportionate as that
of, say, the Anchorage airport in Ref. [5]. Hence, we
classify our observation as "weak anomalous centrality".

TABLE I: Basic network properties of the carriers. N and E denote the number of nodes and edges respectively, and <q>
the mean node degree. <l>, <b> and <C> denote the mean of the geodesic, betweenness, and clustering coefficient distributions;
r, G(q), G(s) denote assortativity and degree and strength Gini coefficients. a(q) is the skewness of the degree distribution.

Carrier   N      E       <q>     <l>     <b>      <C>     r        G(q)    G(s)    a(q)
SW        64     892     27.88   1.542   0.0091   0.731   -0.177   0.254   0.490   ...
US        96     556     11.58   ...     ...      ...     ...      ...     ...     ...
CO        ...    ...     ...     ...     ...      ...     ...      ...     ...     ...
UA        ...    ...     ...     ...     ...      ...     -0.320   0.498   ...     ...
AA        121    1163    19.22   1.889   0.0076   0.646   -0.280   0.461   0.794   ...
NW        132    753     11.41   2.023   0.0080   0.624   -0.269   0.493   0.760   2130.143
DL        133    906     13.62   1.943   0.0073   0.586   -0.272   0.499   0.790   2168.703
NW+DL     163    1529    18.76   1.985   0.0062   0.617   -0.256   0.497   0.767   2682.666
UPS       107    606     11.33   1.929   0.0090   0.620   -0.249   0.427   0.628   1618.748
FX        334    1355    8.11    3.060   0.0062   0.579   -0.047   0.548   0.696   1457.096
Agg7      197    3505    35.58   1.926   0.0048   0.710   -0.244   0.497   0.760   2993.068
AggPass   817    9688    23.72   3.181   0.0027   0.639   0.185    0.630   -       8758.680
AggAll    1258   17437   27.72   3.005   0.0016   0.557   0.097    0.677   -       17484.512

TABLE II: Properties of Erdős-Rényi (ER) random graphs
with N and E corresponding to the carriers in Table I.

ER        <l>     <b>      <C>     r        G(q)
SW        1.533   0.0090   0.446   -0.065   0.086
US        2.070   0.0116   0.118   -0.034   0.145
CO        2.101   0.0097   0.106   -0.025   0.143
UA        2.151   0.0098   0.095   -0.040   0.156
AA        1.864   0.0074   0.158   -0.008   0.109
NW        2.242   0.0097   0.091   -0.000   0.154
DL        2.103   0.0085   0.102   0.012    0.151
NW+DL     1.977   0.0061   0.116   0.003    0.116
UPS       2.134   0.0110   0.101   -0.051   0.148
FX        3.002   0.0061   0.035   0.017    0.190
Agg7      1.810   0.0042   0.181   0.012    0.090
AggPass   2.457   0.0018   0.029   -0.006   0.113
AggAll    2.51    0.0012   0.022   -0.018   0.108

We analyze the distribution of node degree and node
strength, with $p(q)$ the observed probability of a carrier
having a node of degree $q$ and $p(s)$ the observed probability
of having a node with strength $s$. These raw probability
distributions for our networks are extremely noisy,
thus we construct the complementary cumulative distributions
$P(x) = \sum_{x' \geq x} p(x')$. Figure 1 shows the cumulative
degree distributions, while Fig. 7 (in Appendix A)
shows the cumulative strength distributions.
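
The complementary cumulative distribution smooths a noisy histogram by accumulating probability from the tail. A minimal sketch is given below, assuming the convention that $P(x)$ is the fraction of nodes with value greater than or equal to $x$ (so that $P$ equals 1 at the smallest observed value); the function name `ccdf` and the sample data are illustrative.

```python
from collections import Counter


def ccdf(values):
    """Return sorted (x, P(x)) pairs, with P(x) = fraction of values >= x."""
    counts = Counter(values)
    n = len(values)
    points = []
    remaining = n  # number of observations >= the current x
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points


# Hypothetical degree sequence of a four-airport network.
print(ccdf([1, 1, 2, 4]))
```

Plotting these pairs on log-log and linear-linear axes gives views of the kind described for Fig. 1.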

For each carrier we analyze how well the empirically
observed degree distribution can be fit by a theoretical
distribution, considering the following forms for the theoretical
cumulative distribution function: 1) power law
(pl), 2) exponential (exp), 3) stretched exponential (se),
4) power law with exponential decay (pled), 5) cumulative
log-normal (cln) distribution. We use the nonlinear
least squares fitting routine of the R Statistical Computing
platform [25] to solve for the parameter values
of each candidate distribution which provide the best
fit to the data. Finally, we calculate the residual sum
of squares between these best fit candidate distributions
and the empirical data. In almost all cases, one of the
candidate distributions clearly minimizes this difference.
See Table III in Appendix A. Though of course there exist
more rigorous methods for extracting the best fit power
law exponent to a data set [26], the airline networks analyzed
herein are too far from power law distributions to
warrant the overhead associated with such techniques.

The theoretical distribution which best describes the
aggregate over the seven passenger carriers (Agg7) is a
simple exponential distribution. The aggregate over all
passenger carriers (AggPass) is best described by a cumulative
log-normal distribution. The aggregate over all
flights flown in 2007 (AggAll) is best fit by a power law with
exponential tail. Furthermore, as can be seen in Fig. 1, for
all three distinct aggregate views, the tail decays more
sharply than exponential. The SW network, similar to
AggPass, is best described by the cumulative log-normal
distribution. The other six individual carriers studied
all have networks with degree distributions that are well
described by simple exponential distributions. However,
unlike the aggregate views, the extreme tail events (degree
50-100) are underestimated by the exponential (except
for AA, where exponential is an excellent fit over
the entire range). All networks are unambiguously "right
skewed". Exact values are given as a(q) in Table I. The
skew for SW is an order of magnitude less than that for
other carriers, reflecting its distinct structure.

FIG. 1: Cumulative degree distribution, P(q), for each carrier shown on a log-log scale with the linear-linear view inset. Solid
lines are the best fit theoretical distribution: cln for SW; pled for AA and CO; exp for DL, NW, and UA; pled for US and
FX; cln for UPS; exp for Agg7; cln for AggPass; pled for AggAll. Note, the tail on each of the three different aggregate views
decays more quickly than exponential.

FIG. 2: (a) Cumulative k-core distribution, F(k), of the largest passenger carrier airline networks, selected cargo carriers, and
three different aggregate views. (b) Comparison of F(k) for a PP airline (SW) and an HS airline (AA) to corresponding power
law (PL) and Erdős-Rényi (ER) random graphs. For the PL and ER graphs, results are the average over 50 independent
realizations. The most occupied shell for ER graphs is the highest ($k_{\max}$) shell, while for PL graphs it is the smallest shell (the
1-shell).



K-CORE STRUCTURES 

The SW network is distinguished from the networks
of the other carriers by the metrics of Table I, yet the
difference in topology is even more pronounced when the
k-core structures of the distinct carriers are compared.
The k-core of the network is a subgraph constructed
by iteratively pruning all vertices with degree less than
k [27, 28]. For instance, starting from an original network
we remove all nodes with degree $q < k$ and their
corresponding edges, then successively remove all nodes
(along with their edges) which are now of degree $q < k$
in the pruned network, and continue iterating until all
remaining nodes have $q \geq k$. The remaining subgraph is
the k-core. We also consider the k-shell, which consists
of all nodes which are present in the k-core but not in the
(k + 1)-core. Likewise, the "coreness" of node $i$, denoted
$c_i$, is defined as the largest value of k for which the node
is a member of the k-core.
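
The pruning procedure translates directly into code. The sketch below is an illustrative implementation, not the one used for the results: it assumes the network is stored as a dictionary of neighbour sets, and the names `coreness` and `k_shell` are chosen here for convenience. At each level k it repeatedly removes nodes whose remaining degree is at most k; those nodes belong to the k-core but not the (k + 1)-core, so their coreness is k.

```python
def coreness(adj):
    """Map each node to the largest k for which it belongs to the k-core."""
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        # Nodes whose remaining degree is too small for the (k+1)-core.
        to_remove = [n for n in alive if degree[n] <= k]
        if not to_remove:
            k += 1  # everything left has degree > k: move to the next core
            continue
        while to_remove:
            node = to_remove.pop()
            if node not in alive:
                continue  # already pruned via a duplicate entry
            alive.discard(node)
            core[node] = k
            for nbr in adj[node]:
                if nbr in alive:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        to_remove.append(nbr)  # pruning cascades
    return core


def k_shell(core, k):
    """Nodes in the k-core but not in the (k + 1)-core."""
    return {node for node, c in core.items() if c == k}


# A star: one hub of degree 4, yet every node has coreness 1.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# A 4-clique with one extra airport attached to node 0.
mesh = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}

print(coreness(star))
print(coreness(mesh))
```

The star example shows why high degree alone does not place a node in a high core, whereas the densely interconnected clique forms a 3-core regardless of the pendant node.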

The k-core decomposition is a computationally inexpensive
way of revealing additional details about the
structural role of nodes beyond their degrees and has
lately been the focus of several studies in network theory.
It has been used to predict protein functions from
protein-protein interaction networks and amino acid sequences
[29] and to identify the inherent layered structure
of the protein interaction network [30]. More recently,
the method of k-shell decomposition has been used to arrive
at a model of internet topology [31]. The additional
structural information revealed by k-core decomposition
has been used to generate random graphs with a specified
"k-core fingerprint" and simulate the AS network of
the internet [32].

Figure 2(a) shows the k-core structure of all the carriers
studied herein. Here F(k) is the fraction of all nodes
with coreness greater than or equal to k. Note that for
SW all nodes $i$ have $c_i > 7$, and the fraction of nodes
with large coreness decays much more slowly than for
the other carriers. Two key differences are prominent
when comparing the k-core structure of SW to the other
carriers: the value of $k_{\max}$ and the occupancy of the $k_{\max}$
shell. For $k_{\max}$, in spite of having the smallest number
of nodes $N$, SW achieves the highest k-core, with value
$k_{\max} = 20$ and normalized value $k_{\max}/N = 0.312$. (The
next largest is American Airlines (AA) with $k_{\max} = 17$
and normalized value $k_{\max}/N = 0.140$.) With respect
to occupancy, that of the largest shell in SW is especially
remarkable, with 53% of all airports belonging to
the $k_{\max}$-core. In contrast, for AA, 26% belong to the
$k_{\max}$-core.

Figure 2(b) compares the k-core structure of SW and
AA to the k-core structure of similarly sized power-law
(PL) and Erdős-Rényi (ER) random graphs (with N =
64 for SW and N = 121 for AA). The PL curve is the
average over an ensemble of 50 independent realizations,
each generated by applying the configuration model to a
degree sequence selected from a power law distribution
with exponent 1.8 (consistent with the power law fit to
the world-wide air-transportation network from [3]). The
ER curve is the average over 50 independent realizations
generated with the respective N and number of edges
M = 892 for SW and M = 1163 for AA. The most
occupied shell for Erdős-Rényi random graphs is the $k_{\max}$
shell, while that of power-law random graphs is the 1-shell.
The real networks (SW and AA) lie between their
two corresponding random archetypes.

When examining the occupancy of each k-shell individually
(i.e., the proportion of nodes in that shell), the
individual passenger airlines differ dramatically from the
larger aggregate views and from the cargo carrier FX. For
all the passenger carriers the $k_{\max}$ shell has the largest occupancy
(except UA, in which the ($k_{\max} - 1$) shell, which
is the 10-shell, is the largest). In contrast, for the AggPass,
AggAll, and FX networks the most occupied shell occurs
for one of the smallest values of k (the 2-shell in each
case).

The highest core of the HS networks studied here contains
that carrier's hubs and consequently its most viable
transfer points, consistent with prior work suggesting
that the core of a network plays a special role in
enhancing navigability of networks where global structural
information is unavailable [33]. The large value of
$k_{\max}$ for SW and the large occupancy of the $k_{\max}$-shell
suggest that there are many redundant transfer points in
the SW network in the cases where a direct connection is
not available between source-destination pairs. Indeed,
while direct connections may not be available in the SW
network, transfers can be made at numerous nodes in the
$k_{\max}$-core.

While in the domain of the airline networks degree and
k-core are strongly correlated, especially for large k, in
general having high degree is not sufficient for inclusion
in a high k-core (consider for instance a star graph, all
of whose nodes belong to the 1-core and no others). Indeed,
certain networks may avoid the paradigm that hubs
belong in the highest core. Hubs located in the core of
a network play a different structural role than those located
peripherally (in low k-cores): hubs in the core provide
gains in efficiency, for instance by reducing network
diameter, but form critical targets without which the network
loses connectivity and function (consider for example
weather disturbances that temporarily disable an HS
airline's hubs). While peripheral hubs may not offer the
gains in efficiency provided by core hubs, the connectivity
of the core of the network remains unaffected when such
hubs are disabled. A discussion alluding to this tradeoff
of core versus peripheral hubs in the internet is found in
Ref. [34].



RESILIENCE 

Of great interest is the individual passenger carrier's
resilience to random edge deletion and to targeted and random
node deletion. Edge deletion corresponds to, for
instance, disturbances such as weather preventing travel
between a pair of airports (i.e., deletion of a flight path).
Node deletion corresponds to the closure of an airport.
There is extensive literature investigating various real
and simulated networks' resilience to both random and
targeted node and edge removal. Previous work found
that simulated random power-law networks are robust
to random node deletion but vulnerable to targeted attack
[35]. Different targeted attack strategies have been
investigated on various real and simulated networks using
a variety of metrics, notably average inverse geodesic
distance (also called 'network efficiency') and the relative
size of the largest connected component [36]. The robustness
of graphs with various kinds of degree distributions
has also been studied recently, e.g., in Refs. [37, 38] and
references therein.

To quantify the performance of the networks under the 
various deletion processes, we use two topological mea- 
sures: the size of the largest connected component (de- 
noted S) and a relative global travel cost metric (intro- 
duced below and denoted T) which includes scaled contri- 
butions accounting for both spatial (geographic) distance 
and geodesic distance (hop-count). 

The travel cost metric is defined as follows. To include
information on geographic distance, we first augment
$G_c(N_c, E_c)$ by including on each edge the geographic
length of the edge. Then for every possible
source-destination pair $(i, j)$, we apply Dijkstra's
algorithm [39] (as implemented in NetworkX [40]) to
determine the path with the shortest geographic distance
connecting $i$ and $j$. In the event that there is an edge
directly between $i$ and $j$, the shortest path is simply
$(i, j)$. If the path consists of a sequence of edges (denote
these $(i, i_1), (i_1, i_2), \ldots, (i_m, j)$), we calculate the
total path length $d_{ij}$ by adding the lengths of the edges,
writing $\ell(a, b)$ for the geographic length of edge $(a, b)$:

$$ d_{ij} = \ell(i, i_1) + \ell(i_1, i_2) + \cdots + \ell(i_m, j). $$

Next we convert the geographic path length to a 'flight
time' by dividing by a characteristic velocity ($v = 500$
miles/hour). For each of the $m$ intermediate nodes in
the path we add a fixed 'transfer cost' of $\theta = 1.0$ hour to
account for layover time:

$$ t_{ij} = \frac{d_{ij}}{v} + m\theta. $$

Finally, we can define the travel cost for the whole network
or for just a subset of nodes in the network $M \subseteq N_c$
as the sum over all of the included path costs:

$$ T(M) = \sum_{i \in M} \sum_{j \in M} t_{ij}. $$

Note, the travel cost over the entire network is $T(N_c)$.
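The travel cost definition can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code (the text states that they used NetworkX); the function names, the adjacency-dictionary layout, the three-airport example, and the rule of breaking distance ties by fewer hops are all assumptions made for illustration. Pairs with i = j are skipped, which is equivalent to assigning them zero cost.

```python
import heapq

V_MPH = 500.0    # characteristic velocity v from the text (miles/hour)
THETA_H = 1.0    # transfer cost per intermediate node (hours)


def shortest_geo_paths(adj, source):
    """Dijkstra on geographic length.

    Returns {node: (distance, hops)}. Ties in distance are broken by
    fewer hops (an assumption, not stated in the text).
    """
    best = {source: (0.0, 0)}
    heap = [(0.0, 0, source)]
    while heap:
        dist, hops, u = heapq.heappop(heap)
        if (dist, hops) > best[u]:
            continue  # stale heap entry
        for v, length in adj[u].items():
            cand = (dist + length, hops + 1)
            if v not in best or cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand[0], cand[1], v))
    return best


def travel_cost(adj, subset=None):
    """T(M): sum of t_ij = d_ij / v + m * theta over ordered pairs in M."""
    nodes = list(adj) if subset is None else list(subset)
    total = 0.0
    for i in nodes:
        paths = shortest_geo_paths(adj, i)
        for j in nodes:
            if i == j:
                continue
            if j not in paths:
                return float("inf")  # some pair is disconnected
            dist, hops = paths[j]
            # A path with `hops` edges has hops - 1 intermediate nodes.
            total += dist / V_MPH + (hops - 1) * THETA_H
    return total


# Hypothetical three-airport line: A-B is 500 miles, B-C is 1000 miles.
adj = {
    "A": {"B": 500.0},
    "B": {"A": 500.0, "C": 1000.0},
    "C": {"B": 1000.0},
}
print(travel_cost(adj))
```

In this toy network the A to C trip costs 1500/500 = 3 hours of flight plus one 1-hour layover at B, so each ordered pair contributes 1, 2, or 4 hours.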

With these two metrics (S and T) in mind, we ana- 
lyze the resilience of the carrier networks to random edge 
deletion and to random and targeted node deletion. We 
first consider random edge deletion as increasing num- 
bers of edges are deleted. For each number of deleted 
edges, we generate an ensemble of 50 randomly selected 
sets of edges to delete. Figure [3] (a) shows the results
for S (the relative size of the largest connected component)
as more edges are removed. Remarkably, SW has
nearly 98% of its nodes in the largest connected component
even after the deletion of 80% of its edges (and remains
100% connected for every realization in the ensemble
until 30.8% of the edges are removed). In contrast, all of
the other carriers have realizations that start losing full
connectivity after the deletion of fewer than 2% of edges.

FIG. 3: Two metrics illustrating the resilience of the passenger carriers: (a) Relative size of the largest connected component
(S) of each passenger carrier's network as a function of the proportion of edges removed by random failure (r), averaged over 50
realizations. (b) Travel cost metric $T(M)/T_0(M)$ ($M$ is the set of nodes in the largest connected component), evaluated on the largest
connected component of the passenger carrier's networks as a function of r, averaged over 50 realizations. Representative error
bars (with height one standard error above and below the mean) are shown in both (a) and (b) on SW and US.

FIG. 4: S of each passenger carrier's network as a function
of the proportion of nodes removed by targeted attack ($f$).
The node (and its edges) with the largest existing degree
(main panel) or betweenness centrality (inset) is iteratively
removed at each step. Targeting by betweenness rather than degree causes
more rapid breakdown of each carrier's network. The dashed
diagonal line depicts the maximal size of S under this process
for any network (i.e. the size of S for a complete graph).
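The random edge deletion experiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the edge-list representation, the rounding of the number of deleted edges, and the fixed random seed are all hypothetical choices. Only the ensemble size of 50 comes from the text.

```python
import random


def largest_component_fraction(nodes, edges):
    """Return S: the fraction of `nodes` in the largest connected component."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(nodes)


def mean_S_after_edge_deletion(nodes, edges, r, realizations=50, seed=0):
    """Average S over an ensemble of random deletions of a fraction r of edges."""
    rng = random.Random(seed)
    n_delete = round(r * len(edges))  # rounding rule is an assumption
    total = 0.0
    for _ in range(realizations):
        removed = set(rng.sample(range(len(edges)), n_delete))
        kept = [e for idx, e in enumerate(edges) if idx not in removed]
        total += largest_component_fraction(nodes, kept)
    return total / realizations


# Hypothetical hub-and-spoke toy network: hub 0 with four spokes.
nodes = [0, 1, 2, 3, 4]
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(mean_S_after_edge_deletion(nodes, star_edges, r=0.5))
```

In the star example every deleted edge strands one spoke airport, which mirrors the observation that hub-and-spoke networks start losing full connectivity after very few edge deletions.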

Once some nodes are disconnected, there is no shortest
path to any of these disconnected nodes so the travel
cost over the whole network is formally infinite. Consequently,
when calculating the travel cost we consider
only the nodes in the largest connected component of
the randomly damaged graph. We calculate the travel
cost between all source-destination pairs in this subset in
the original graph, $T_0(M)$, and in the damaged graph,
$T(M)$. Limiting the cost metric to the largest connected
component gives a measure of the efficiency of the underlying
structure independent of the disconnected nodes.
Finally, we normalize the damaged travel cost through
the largest connected component by the travel cost for
this subset in the original network to obtain the relative
travel cost of the damaged network, $T(M)/T_0(M)$. In this
manner, we eliminate network size effects by comparing
the performance of the damaged network only with the
corresponding original network.

As shown in Fig. [3] (a), for all of the carriers, the ma- 
jority of the network remains connected even after re- 
moving a high percentage of the edges. Yet the HS net- 
works are fragile in the sense that a small set of nodes 
can be completely disconnected from the network even in 
the low regime of edge deletion which reasonably models 
weather disturbances (< 20%). This result is consistent 
with the prevalence of low-degree nodes occupying the 
lower k-shells in the HS networks.

The cost metric gives a more subtle portrait of the effect
of removing edges from the carrier network, Fig. [3]
(b). For a small fraction of deleted edges, all carriers have
similar performance. Yet once more than 15% of edges
are deleted, SW exhibits the lowest ensemble-averaged
travel cost through the largest component. Intuitively, a
well-connected (high density) PP structure permits the
carrier to use the majority of its nodes as viable transfer
points between most source-destination pairs, so even if
an edge deletion process eliminates the original shortest
path between a pair, other nearly shortest paths are
likely to remain. We expect HS networks, which route the
majority of travel paths through relatively few (3-5)
hubs, to perform worse under this metric since deletion
of a nearby hub necessitates inefficient transcontinental
crossings to the next-nearest hub in order to access the
rest of the network.

In addition to random edge removal, we perform both
random and targeted node removal on the passenger carrier
networks. Similar to the analysis in Ref. [36], we target
nodes by either degree or betweenness. Figure [4] shows
results for iterative removal of the node with highest degree,
while the inset of Fig. [4] shows results for iterative
removal of the node with highest betweenness. The SW
network stands out from the other passenger carriers,
remaining fully connected after removing more than 30%
of nodes targeted by betweenness and more than 50% of
nodes targeted by degree. We also find that all carriers
are resilient to random node removal (not shown here).
This does not come as a surprise, given that networks
with right-skewed degree distributions are typically
robust to random failures of their nodes.
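The degree-targeted attack can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the tie-breaking rule (the first node encountered among those of equal degree) are assumptions, S is measured relative to the original number of nodes, and self-loops are assumed absent. Targeting by betweenness would follow the same loop with a betweenness score in place of the degree, and is not implemented here.

```python
def largest_component_size(adj):
    """Number of nodes in the largest connected component of `adj`."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def degree_attack_curve(nodes, edges):
    """S after each removal of the node with the largest current degree."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n0 = len(adj)
    curve = []
    while adj:
        # Degrees are recomputed after every removal ("largest existing degree").
        target = max(adj, key=lambda n: len(adj[n]))
        for w in adj.pop(target):
            adj[w].discard(target)
        curve.append(largest_component_size(adj) / n0)
    return curve


# Complete graph on five nodes: S falls along the diagonal 1 - (fraction removed).
k5_nodes = list(range(5))
k5_edges = [(a, b) for a in k5_nodes for b in k5_nodes if a < b]
print(degree_attack_curve(k5_nodes, k5_edges))

# Hypothetical star network: removing the hub isolates every spoke at once.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(degree_attack_curve(k5_nodes, star_edges))
```

The complete-graph case reproduces the upper bound drawn as the dashed diagonal in Fig. [4], while the star shows the opposite extreme.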



REDESIGNING NETWORKS: K-CORE
ENHANCING, COST PRESERVING REWIRING

Of great interest is understanding how to increase the
resilience of an existing network. We show here one
rewiring process which can increase binary edge density,
k-core (both in terms of the value of $k_{\max}$ and the
occupancy of larger k-shells), and consequently resilience to
node and edge deletion without increasing flight or airport
requirements. Given a set of four nodes connected
as shown in figure [5] (a), the missing connection to form
a 4-clique can be created by routing a small number of
flights along the missing edge connecting nodes 1 and
3. To preserve the gate requirements, a flight originally
between nodes 3 and 4 is rerouted along 2 and 4. In
this manner the total number of flights (the sum over all
edges) and the gate requirements (the in-strength and
out-strength of each node) remain constant. The addition
of the edge connecting 1 and 3 raises the coreness of
at least one of these two nodes.
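A small Python sketch can make the bookkeeping of this rewiring concrete. Several details are assumptions rather than statements from the text: the graph is treated as undirected (so a single strength replaces in- and out-strength), the initial four-node layout is taken to be the cycle 1-2-3-4 plus the diagonal 2-4, the flight sent along the new edge (1, 3) is assumed to come from edge (1, 2), and all weights and function names are hypothetical.

```python
def strengths(weights):
    """Node strength: total weight of the edges incident to each node."""
    s = {}
    for (u, v), w in weights.items():
        s[u] = s.get(u, 0) + w
        s[v] = s.get(v, 0) + w
    return s


def core_numbers(weights):
    """Coreness of each node in the binary (unweighted) graph, by peeling."""
    adj = {}
    for (u, v), w in weights.items():
        if w > 0:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    core, k = {}, 0
    while adj:
        k = max(k, min(len(nb) for nb in adj.values()))
        peel = [n for n, nb in adj.items() if len(nb) <= k]
        for n in peel:
            core[n] = k
            for m in adj.pop(n):
                if m in adj:
                    adj[m].discard(n)
    return core


def rewire_square(weights, flights=1):
    """Move flights (1,2) -> (1,3) and (3,4) -> (2,4), keeping donors >= 1."""
    w = dict(weights)
    for donor, receiver in (((1, 2), (1, 3)), ((3, 4), (2, 4))):
        if w[donor] - flights < 1:
            raise ValueError("donor edge would lose its last flight")
        w[donor] -= flights
        w[receiver] = w.get(receiver, 0) + flights
    return w


# Hypothetical daily flight counts on the initial layout (no edge (1, 3)).
before = {(1, 2): 4, (2, 3): 3, (3, 4): 4, (1, 4): 3, (2, 4): 2}
after = rewire_square(before)

print(strengths(before) == strengths(after))        # gate requirements kept
print(sum(before.values()) == sum(after.values()))  # total flights kept
print(core_numbers(before), core_numbers(after))
```

Under these assumptions all four nodes start with coreness 2 and end with coreness 3, since the rewired binary graph is a 4-clique, while every node strength and the total flight count are unchanged.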

To test this rewiring scheme, we extracted the 'Daily
2-flight minimum' weighted subnetwork for each carrier
c, formed by rescaling each edge weight $s_{ij}$ to a whole
number of daily flights (rounding down) and
removing all edges with new weight less than 2 (each edge
must begin with at least two daily flights in order to be
considered for rewiring by our scheme).

Fig. [6] shows the results for UA (chosen since it has
a particularly shallow k-core structure) after adding
10, 20, 30, and 50 percent additional edges according to
the flight- and gate-preserving scheme. The initial network
has 62 nodes and 113 edges. The rewiring scheme
increases the original binary edge density by 9.7% upon
adding 10% extra edges and by 18.6% upon adding 20%
extra edges. Moreover, it moves lower k-core nodes into
higher cores. Fig. [6] (a) shows the resulting enhanced
k-core structure. Fig. [6] (b) shows the increased resilience
with respect to S, the fraction of nodes in the giant
component. The rewired networks are also more resilient to
random edge removal (not shown here).

FIG. 5: Example of a strength-preserving rewiring which
increases binary (unweighted) edge density and k-core of nodes.
(a) The initial logical weighted connectivity of a set of four
nodes. No explicit geography is implied by this layout. (b)
Addition of a direct link between nodes 1 and 3 with
adjustments of the weights on the existing links increases the
coreness of 1 or 3 or both. The strength of each node and
the sum over all edge weights remain constant despite the
rewiring.

While the specific many-variable optimization prob- 
lems solved by the carriers may preclude such simple 
rewirings, this example suffices to show the existence 
of strength-preserving transformations which increase bi- 
nary edge density and consequently network resilience 
to node and edge failure. It is difficult to conceive of
a degree-preserving rewiring which enhances the k-core
structure of networks, as the degree sequence essentially
determines the k-core structure of a network [41, 42].

CONCLUSION 

We have analyzed the network structures of the major
passenger airlines in the USA with the aim of furthering
understanding of how to better design and operate critical
infrastructure. We find the degree distributions of several
of the networks, including the aggregate views, are
well described by simple exponential or cumulative
log-normal distributions. We notice that the Gini coefficient
can supplement the structural knowledge gained from degree
assortativity in networks. We establish connections
between k-core structures of networks and how airlines
can be well-connected and resilient to disturbances as
diverse as random and targeted attacks or failures, by both
degree and betweenness. Similar results hold whether the
disturbance involves airports or flight paths. Furthermore,
using a travel time heuristic introduced herein, we
find that connectivity can be maintained without much
increase in travel time, despite significant disruption.

Of the seven largest USA passenger air carriers,
especially remarkable is Southwest Airlines, where more than
half of all nodes belong to the $k_{\max}$-core, leading to
extreme resilience for that network. We present a simple
rewiring scheme which can help existing airlines approach
such k-core structures, without increase of flight or
aircraft requirements.

FIG. 6: Applying the rewiring of Fig. [5] to the daily 2-flight minimum UA network increases k-core structures and resilience
to degree-targeted node deletion (removed exactly as in Fig. [4]). (a) Cumulative k-core distribution, F(K), of the increasingly
rewired UA network, along with the SW daily 2-flight minimum network for reference. (b) S as a function of $f$ for the networks in (a).

Our findings on k-core and resilience suggest that
hierarchical networks could be especially susceptible to
targeted attacks or failures, given the sparse population of the
highest k-cores of such networks. The future design and
operation of critical infrastructure may benefit from
analyzing the tradeoffs of core versus peripheral placement
of hub nodes. Hubs located in the core of a network
substantially increase efficient connectivity yet are critical
targets, as without them the network loses connectivity.
Hubs in the periphery (low k-cores) offer smaller benefits
with respect to efficient connections, yet if they are
disabled the connectivity of the core of the network remains
largely unaffected.

It is worth noting that the effect of targeted attack by
betweenness, rather than by degree, is significantly more
pronounced on each carrier's network. This complements
previous studies on the importance of betweenness in the
World Airline Network [5] and suggests that betweenness
is an important criterion for consideration in critical
infrastructure networks.

Our findings may be applicable to the design and
operation of a range of infrastructure networks, for
instance other transportation networks, power grids,
and telecommunication networks. In addition, including
analysis of resilience properties of networks would
augment current studies on the optimal distribution of
resources or facilities in a given geographical area. The
analysis herein is a first step. Real weather would
correspond to correlated, not random, edge deletion. Moreover,
we neglect scheduling and restrict ourselves to the
domestic routes of some international carriers.
APPENDIX A: DATA FITTING

We determine which statistical distribution best fits
the empirical complementary cumulative degree
distribution, $P(q) = \sum_{i \geq q} p(i)$, for each individual
airline carrier's network, choosing amongst: 1) power
law, 2) exponential, 3) stretched exponential, 4) power
law with exponential decay, 5) cumulative log-normal
distribution. We use the nonlinear least squares fitting
routine of the R Statistical Computing platform [25] to solve
for the parameter values for each candidate distribution
which provide the best fit to the data. Finally, we
calculate the residual sum of squares between these best fit
candidate distributions and the empirical data. The
explicit values are given in Table III. The smallest value
of the residual sum of squares over all the distributions
is highlighted in red. If multiple candidate functions do
equally well up to the first two significant figures, both
are highlighted in red. In all these cases it is the EXP
and PLED which perform equally well (and the "power
law" portion of the PLED has exponent quite close to
zero).
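The fitting procedure can be illustrated with a minimal Python sketch of its two ingredients: building the complementary cumulative degree distribution and scoring a candidate by its residual sum of squares. This is not the R routine used in the text. The one-parameter exponential form P(q) = exp(-(q - q_min)/kappa), the plain grid search standing in for nonlinear least squares, the function names, and the toy degree list are all assumptions made for illustration.

```python
import math
from collections import Counter


def ccdf(degrees):
    """Return sorted (q, P(q)) pairs, with P(q) the fraction of nodes of degree >= q."""
    n = len(degrees)
    counts = Counter(degrees)
    points, remaining = [], n
    for q in sorted(counts):
        points.append((q, remaining / n))
        remaining -= counts[q]
    return points


def rss(points, model):
    """Residual sum of squares between the empirical P(q) and model(q)."""
    return sum((p - model(q)) ** 2 for q, p in points)


def exponential(kappa, q_min):
    """Assumed exponential candidate, normalised so that P(q_min) = 1."""
    return lambda q: math.exp(-(q - q_min) / kappa)


def fit_exponential(points, kappas):
    """Grid search over `kappas` for the value minimising the RSS."""
    q_min = points[0][0]
    return min(kappas, key=lambda k: rss(points, exponential(k, q_min)))


# Hypothetical degree list whose CCDF halves at each step: 1, 1/2, 1/4.
degrees = [1, 1, 2, 3]
points = ccdf(degrees)
grid = [0.5, 1.0 / math.log(2.0), 3.0]
best = fit_exponential(points, grid)
print(points, best, rss(points, exponential(best, 1)))
```

Repeating the scoring step for each candidate family and comparing the resulting residual sums of squares gives a table of the same shape as Table III.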



TABLE III: Residual sum of squares of the best fit functional
form to the empirical airline data, choosing amongst
power law (PL), exponential (EXP), stretched exponential
(SE), power law with exponential decay (PLED) and the
cumulative of a log-normal (CLN).

Source    PL       EXP       SE      PLED      CLN
AggAll    0.7723   0.4222    7.516   0.03428   0.04669
AggPass   0.8448   0.3540    7.226   0.06245   0.05677
Agg7      2.169    0.1066    15.32   0.1047    0.3098
SW        3.102    0.5565    17.94   0.1569    0.03924
AA        1.599    0.04187   7.875   0.03506   0.1126
CO        0.6395   0.02408   3.788   0.01919   0.03244
DL        0.8928   0.01375   4.431   0.01361   0.02473
NW        0.632    0.04955   3.558   0.04939   0.05697
UA        0.6775   0.03523   3.752   0.03515   0.03662
US        0.5079   0.04065   3.305   0.03742   0.05414
FX        0.2517   0.05657   1.568   0.02053   0.0382
UPS       0.8909   0.06618   3.941   0.04466   0.02802
FIG. 7: Cumulative strength distribution, P(s), for each carrier shown on a log-log scale. These distributions span several
decades of range. In contrast, the corresponding cumulative degree distributions (shown in Fig. [1]) terminate orders of magnitude
earlier. SW has no nodes with s < 2000, hence its plot begins with that value.



